Skip to content

Make testTeardownSequence() resilient to scheduling latency#300

Merged
iCharlesHu merged 1 commit into
swiftlang:mainfrom
broken-circle:testTeardownSequence-determinism
Jun 9, 2026
Merged

Make testTeardownSequence() resilient to scheduling latency#300
iCharlesHu merged 1 commit into
swiftlang:mainfrom
broken-circle:testTeardownSequence-determinism

Conversation

@broken-circle

@broken-circle broken-circle commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

testTeardownSequence() failed on a CI run for an unrelated Windows-only change in #296:

✘ Test testTeardownSequence() recorded an issue at UnixTests.swift:233:21: Expectation failed: outputs == ["saw SIGQUIT", "saw SIGTERM", "saw SIGINT"]
↳ outputs == ["saw SIGQUIT", "saw SIGTERM", "saw SIGINT"] → false
↳   outputs → ["saw SIGQUIT", "saw SIGTERM"]
✘ Test testTeardownSequence() recorded an issue at UnixTests.swift:238:9: Expectation failed: result.terminationStatus == .exited(42)
↳ result.terminationStatus == .exited(42) → false
↳   result.terminationStatus → signaled(9)
↳     signaled → 9
↳   .exited(42) → exited(42)
↳     exited → 42

The test drove a teardown(using:) sequence (SIGQUIT, SIGTERM, then SIGINT, each with a grace period, ending in an implicit SIGKILL) against a bash child whose INT trap echoes and exit 42s, and which otherwise looped in a foreground sleep 0.1. On a loaded Linux runner it failed with signaled(9) and a transcript missing the final saw SIGINT.

Bash defers a trapped signal until the running foreground command completes. When scheduling pressure delayed the deferred INT trap past the grace period, the implicit SIGKILL reached the child before its deferred INT trap ran, so it died on the signal instead of exiting 42. Only the SIGINT step gates anything; the QUIT and TERM traps just echo, so those steps always run their full duration regardless of timing, which is why the sequence got that far and no further.

This PR idles in wait on a backgrounded sleep instead. A trapped signal interrupts wait immediately and runs the handler at once, with no dependence on a sleep interval elapsing, so the INT trap fires well inside the grace period even under load. The backgrounded sleep is short, so a signal that lands as bash enters the wait is serviced at the next safe point within one interval, rather than blocking on a long-lived child.

I ran the pre-fix test on GitHub Actions under heavy CPU oversubscription on the same amazonlinux2 nightly image, but wasn't able to reproduce the failure across 500 iterations. The failure is load-dependent so it's possible I didn't recreate the exact scenario, or this reproduces too rarely to surface in 500 runs.

The test drove a `teardown(using:)` sequence (`SIGQUIT`, `SIGTERM`, then
`SIGINT`, each with a grace period, ending in an implicit `SIGKILL`)
against a bash child whose `INT` trap echoes and `exit 42`s, and which
otherwise looped in a foreground `sleep 0.1`. On a loaded Linux runner
it failed with `signaled(9)` and a transcript missing the final
`saw SIGINT`.

Bash defers a trapped signal until the running foreground command
completes. When scheduling pressure delayed the deferred `INT` trap
past the grace period, the implicit `SIGKILL` reached the child before
its deferred `INT` trap ran, so it died on the signal instead of exiting
`42`. Only the `SIGINT` step gates anything; the `QUIT` and `TERM` traps
just echo, so those steps always run their full duration regardless of
timing, which is why the sequence got that far and no further.

Idle in `wait` on a backgrounded `sleep` instead. A trapped signal
interrupts `wait` immediately and runs the handler at once, with no
dependence on a sleep interval elapsing, so the `INT` trap fires well
inside the grace period even under load. The backgrounded `sleep` is
short so a signal that lands as bash enters the `wait` is serviced at
the next safe point within one interval, rather than blocking on a
long-lived child.
@iCharlesHu iCharlesHu merged commit e38733b into swiftlang:main Jun 9, 2026
45 of 46 checks passed
@broken-circle broken-circle deleted the testTeardownSequence-determinism branch June 9, 2026 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants